05. The OpenCV Computer Vision Library
Throughout this course, you will be using OpenCV, a cross-platform computer vision library that was originally developed in 2000 to provide a common infrastructure for computer vision applications and to accelerate the use of machine vision in science and engineering projects. Started by Intel, the open-source library is now supported by several companies and hundreds of experts all over the globe.
The library has more than 2500 algorithms that can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, perform machine learning, and much more. OpenCV is written natively in C++ but also has interfaces to Python, Java and MATLAB. In this course, you will be using the C++ version of OpenCV.
The major advantage of using the OpenCV library is that you can leverage a well-tested set of state-of-the-art computer vision algorithms. Instead of concentrating on the actual implementation of computer vision concepts such as Sobel operators, keypoint detection or machine learning, you can use them right out of the box and concentrate on combining them in the right way to develop a working software prototype. Despite this ease of use, however, a good understanding of the theory behind those concepts is needed to use them correctly.
In the following, you will familiarize yourself with some basic concepts you need to get started with OpenCV and to prepare for the more advanced lessons later in the course. The modules listed below will be used extensively throughout this lecture. They are, however, only a small part of the entire OpenCV library. Later, you will also include some specialized modules such as flann (Fast Library for Approximate Nearest Neighbors) or dnn (Deep Neural Networks), which will be described only in those sections of this course where they are used.
A note on namespaces: Most OpenCV functions exist within the cv namespace. To shorten the code, many applications use the using namespace cv directive. In this course, however, this is not done, so that it is always clear when we are calling functions from OpenCV.
OpenCV Library Overview
The core module is the section of the library that contains all of the basic object types and their operations. To use the library in your code, the following header has to be included:
#include "opencv2/core/core.hpp"
The highgui module contains user interface functions that can be used to display images or take simple user input. To use the library in your code, the following header has to be included:
#include "opencv2/highgui/highgui.hpp"
In this project, basic functions such as cv::imshow will be used to display images in a window.
The imgproc (image processing) module contains basic transformations on images, such as image filtering, geometric transformations, feature detection and tracking. To use the library in your code, the following header has to be included:
#include "opencv2/imgproc/imgproc.hpp"
The features2d module contains algorithms for detecting, describing, and matching keypoints between images. To use the library in your code, the following header has to be included:
#include "opencv2/features2d/features2d.hpp"
The OpenCV Matrix Datatype
The basic data type in OpenCV to store and manipulate images is the cv::Mat datatype. It can be used for arrays of any number of dimensions. The data stored in cv::Mat is arranged in a so-called raster scan order. For a two-dimensional array (such as a grayscale image), this means that the data is organized into rows, and each row appears one after the other. A three-dimensional array is arranged in planes, where each plane is filled out row by row, and then the planes are packed one after the other. A color image, in contrast, is usually stored as a two-dimensional array with several channels, where the channel values of each pixel are stored directly next to each other. To see how this works, let us look into the cv::Mat datatype more deeply:
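As a rough illustration of raster scan order, the sketch below computes where an element would sit in memory. It assumes a continuous buffer without row padding in which the channel values of each pixel are interleaved. The helper name rasterOffset is hypothetical and not part of OpenCV.

```cpp
#include <cstddef>

// Hypothetical helper (not an OpenCV function): offset, counted in elements,
// of the value at (row, col, channel) in a continuous row-major buffer whose
// channel values are interleaved per pixel.
std::size_t rasterOffset(std::size_t row, std::size_t col, std::size_t channel,
                         std::size_t cols, std::size_t channels)
{
    // Skip all complete rows, then the pixels in the current row,
    // then the channels in front of the requested one.
    return (row * cols + col) * channels + channel;
}
```

For a single-channel image with 640 columns, rasterOffset(1, 0, 0, 640, 1) yields 640: the first element of the second row follows directly after the 640 elements of the first row.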
The data inside a cv::Mat variable can be either single numbers or multiple numbers. In the case of multiple numbers (e.g. represented by cv::Scalar), the matrix is referred to as a multichannel array. There are several ways to create and initialize a cv::Mat variable. The create_matrix.cpp file in the workspace below illustrates one way this can be done.
Note: To build and run the code below, use the following steps:
- Go to the virtual Desktop by clicking the Desktop button. You can use Terminator or a VSCode terminal to run the following commands.
- From the /home/workspace/OpenCV_exercises directory, run the command: mkdir build && cd build
- cmake ..
- make
- Run the create_matrix executable from build with the command: ./create_matrix
Workspace
In the code example, the variable m1_8u is created with 480 rows and 640 columns, with a color depth of 8 bits stored as unsigned char and a single channel (hence the _8UC1). Then, the entire image is set to the 8-bit maximum value of 255, which corresponds to white. The function cv::imshow displays the image on the screen. When you execute the code, you should see a white image appear in a window on the screen.
Matrices in OpenCV can also be created with three channels to represent color.
Here is a short task for you: In the create_matrix.cpp file, create a variable of type cv::Mat named m3_8u which has three channels with a depth of 8 bits per channel. Then, set the first channel to 255 using the cv::Scalar datatype and display the result. You can use the OpenCV documentation if you get stuck.
Exercise
SOLUTION: Blue
Manipulating Matrices
Now that you can create matrices, let us try to change some of their entries. By using the command cv::Mat::at<data type>(row, col) = data, the element at the given position can be replaced with data. Please note that the data type you provide to the at function has to match the actual data stored in the matrix you are trying to access.
Here is another short task for you: In the change_pixels.cpp file, write a nested loop that runs over the entire width of the matrix in the example below. Then, set every element to 255. Take special care to select the correct data type for the given format. What does the resulting image look like?
Note: You can build and run your code for this task using the same steps as above, except that for this exercise the executable will be named change_pixels.
Manipulating Matrices
SOLUTION: A white bar from left to right.
Loading and Handling Images
The next thing we want to do is to load an image from file. Let us assume that the image resides in the same directory as the executable. By calling cv::imread, we can load the image from file and assign it to a cv::Mat variable. Take a look at the following code example to see how a single image can be loaded from file. You can build the code as above, and you can run the code from the virtual desktop using the load_image_1 executable.
Assuming that there are 5 images in total in the code directory (img0005.png to img0009.png), they can easily be read from file one after the other. The next example shows how the filename can be assembled from single elements using string concatenation and the setfill function (used together with setw), which ensures that leading zeros are added to the loop variable before it is appended to the filename. You can run the next example using the load_image_2 executable.
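The filename assembly can be sketched with the standard library alone. The function name makeImageName and its parameters are assumptions for illustration, and the workspace code may be structured differently.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds a name such as "img0005.png" from a prefix,
// an index padded with leading zeros to the given width, and a suffix.
std::string makeImageName(const std::string& prefix, int index, int width,
                          const std::string& suffix)
{
    std::ostringstream number;
    // setfill selects the padding character, setw the minimum field width
    number << std::setfill('0') << std::setw(width) << index;
    return prefix + number.str() + suffix;
}
```

Calling makeImageName("img", i, 4, ".png") for i from 5 to 9 produces the names img0005.png to img0009.png, which can then be passed to cv::imread.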
Later in the course, we will load and process several images one after the other. It is important to handle large amounts of data in a smart way so that images and other structures are not needlessly copied. Also, we want to flexibly rearrange data as well as delete and append elements on a regular basis. In C++, this can easily be achieved by using vectors. In the following code, a set of images is loaded from file as before and pushed into a dynamic list of type vector<cv::Mat>. Then, an iterator is used to loop over the list and display the loaded images one by one.
You can run the code below using the load_image_3 executable.
The auto keyword simply asks the compiler to deduce the type of the variable from its initialization, which is much more convenient than writing vector<cv::Mat>::iterator it. The current image within the loop can be accessed by using the *it expression.
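To compare the two spellings side by side, here is a sketch that uses std::string as a stand-in for cv::Mat so that it compiles without OpenCV. The function names joinExplicit and joinAuto are hypothetical. Both loops behave identically and differ only in how the iterator type is written.

```cpp
#include <string>
#include <vector>

// Iterator type written out in full.
std::string joinExplicit(const std::vector<std::string>& names)
{
    std::string result;
    for (std::vector<std::string>::const_iterator it = names.begin();
         it != names.end(); ++it)
    {
        result += *it + ";"; // *it is the current element
    }
    return result;
}

// Same loop, with the iterator type deduced by the compiler.
std::string joinAuto(const std::vector<std::string>& names)
{
    std::string result;
    for (auto it = names.begin(); it != names.end(); ++it)
    {
        result += *it + ";";
    }
    return result;
}
```

With a vector<cv::Mat>, the loop body would typically pass *it to cv::imshow instead of appending to a string.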
Here is a last exercise for you: In the loop of load_image_3.cpp, prevent image number 7 from being displayed.
Summary